Google Sheets - Data Cleaning and Transformation
Google Sheets
Icon

This portfolio piece showcases my expertise in using Google Sheets formulas for robust data cleaning and transformation across two separate projects, using simulated data for illustrative purposes.
Key: Use the project tables below, containing linked descriptions, to navigate the corresponding Google Sheets displayed further below, showcasing the transformation of unstructured data. Best viewed in desktop format (enable "Desktop site" in mobile browsers).

Project 1 - Employee Survey

Problem with raw data
(Form Responses 1)
Data Transformation Description
(Processed Data)
Data Transformation Description
(Processed Data - Combined Formula)
Multiple values stored within a single cell, separated by commas (column F), a format unsuitable for most analytical tools. Utilizes a readability and maintainability focused per-column SEARCH formula to split comma-separated values into distinct columns, transforming them to quantifiable data. Same output as the first transformation, but now an efficiency focused single SEARCH formula handles all keyword extractions from the comma-separated values. Suited for relatively limited digital resources.

Project 2 - Consumer Survey

Problem with raw data
(luminaris_survey_data)
Data Transformation Description
(Consumer Survey Processed Data)
Inconsistent text formatting (extra spaces, capitalization issues) in columns A and B. Utilizes SUBSTITUTE and PROPER formulas to clean data by removing extra spaces and standardizing capitalization.
Inconsistent country abbreviations (e.g., 'UK', 'USA') in column C, which can cause issues with geo-mapping in BI tools. Utilizes IF formula to standardize country names to their full names to ensure compatibility with geo-mapping in BI tools.
Free-form text responses in column J makes it difficult to quickly identify actionable feedback. Utilizes REGEXMATCH formula to categorize text responses into feedback action levels based on keyword matching.